401



RESCALE MASS SPECTRUM DATA SO THAT PERIOD 

f 



I 



FIG. 15



403 



INTERPOLATE COUNTS TO GENERATE EVENLY SPACED DATA
(TO ASSURE THERE ARE SUBSTANTIALLY NUMBER OF
FOR ALL BLOCKS IN THE MASS SPECTRUM DATA)
FOR EXAMPLE, CALCULATE

\[ \mathrm{COUNTS}_{intp} = \frac{t_{intp} - t_{low}}{t_{high} - t_{low}} \left( \mathrm{COUNTS}_{high} - \mathrm{COUNTS}_{low} \right) + \mathrm{COUNTS}_{low} \]
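The interpolation step can be illustrated with a minimal sketch, assuming plain linear interpolation between the two neighbouring samples. The function name `interpolate_counts` and its arguments are hypothetical and are not taken from the drawing.

```python
from bisect import bisect_right


def interpolate_counts(x, counts, n_points):
    """Linearly interpolate (x, counts) onto n_points evenly spaced positions.

    x must be sorted in increasing order. Names are illustrative assumptions.
    """
    if n_points < 2 or len(x) < 2:
        raise ValueError("need at least two input points and two output points")
    step = (x[-1] - x[0]) / (n_points - 1)
    grid = [x[0] + i * step for i in range(n_points)]
    out = []
    for g in grid:
        # Index of the sample at or just below g, clamped so that lo + 1 exists.
        lo = min(max(bisect_right(x, g) - 1, 0), len(x) - 2)
        hi = lo + 1
        frac = (g - x[lo]) / (x[hi] - x[lo])
        # Same form as the flowchart: fraction * (high - low) + low.
        out.append(frac * (counts[hi] - counts[lo]) + counts[lo])
    return grid, out


grid, evenly_spaced = interpolate_counts([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], 4)
```

Here the unevenly spaced input is resampled onto four evenly spaced positions, so every block of the rescaled data can hold the same number of bins.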



NORMALIZE COUNTS (TO SUBTRACT OUT ERROR/ 
NOISE IN BASELINE) 
FOR EXAMPLE, CALCULATE

\[ \mathrm{COUNTS}_{norm} = \mathrm{COUNTS}_{intp} - \mathrm{MIN}_{local}\left( \mathrm{COUNTS}_{intp} \right) \]






DEVELOP AVERAGE FILTER (DECONVOLUTION) KERNEL 

r N y BL0CKS COUNTS R> 
COUNTS L= X , MOM 

[kern Lm (coums^ -COUNTS^) J 


/ N BLOCKS 



405 



407 



DETERMINE KERNEL ERROR, e.g. \( \mathrm{ERROR} = \sum^{N_{BINS}} \sigma_{KERN} \),
AND ITERATE ON f TO MINIMIZE ERROR



T 



^409 



FILTER OPTIMAL KERNEL FROM COUNTS, e.g.
\( \mathrm{COUNTS}_{filtered} = \mathrm{COUNTS}_{norm} - \mathrm{CONSTANT} \times \mathrm{COUNTS}_{kern} \)
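The normalize, average-kernel, and filter steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the data have already been rescaled so that one noise period spans exactly `bins_per_block` bins, the baseline is taken as each block's minimum, and the iteration on f and any per-block scaling are omitted. The function name and parameters are hypothetical.

```python
def remove_periodic_noise(counts, bins_per_block, constant=1.0):
    """Subtract a block-averaged kernel from baseline-normalized counts.

    Assumes one noise period equals bins_per_block bins (illustrative only).
    Returns (kernel, filtered_counts).
    """
    n_blocks = len(counts) // bins_per_block
    if n_blocks == 0:
        raise ValueError("need at least one full block of data")
    blocks = [
        counts[b * bins_per_block:(b + 1) * bins_per_block]
        for b in range(n_blocks)
    ]
    # Normalize: subtract each block's local minimum as the baseline.
    norm = [[c - min(block) for c in block] for block in blocks]
    # Kernel: average of the normalized counts at each bin position over all blocks.
    kernel = [
        sum(block[i] for block in norm) / n_blocks
        for i in range(bins_per_block)
    ]
    # Filter: normalized counts minus constant times the kernel.
    filtered = [
        c - constant * kernel[i]
        for block in norm
        for i, c in enumerate(block)
    ]
    return kernel, filtered


kernel, filtered = remove_periodic_noise([5, 7, 5, 7, 6, 8, 6, 8], bins_per_block=2)
```

For a purely periodic input such as the one shown, the kernel captures the whole pattern and the filtered counts are all zero; a real peak would survive as a positive residual.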






OBTAIN PROTEIN EXTRACT
(e.g. A CELLULAR/TISSUE EXTRACT
HAVING MORE THAN 100 PROTEINS)






201 



203 



LABEL PROTEINS WITH COVALENT 
MASS LABEL 



SEPARATE LABELLED PROTEINS 
(e.g. USE ELECTROPHORESIS — 
ONE OR TWO DIMENSIONAL DEPENDING 
ON NUMBER OF PROTEINS IN SAMPLE; 
CAPILLARY OR GEL ELECTROPHORESIS 
MAY BE USED DEPENDING ON 
MASS LABEL USED) 



207 



DETERMINE COMPLETE OR PARTIAL PROTEIN 
SEQUENCE FOR EACH SEPARATED LABELLED 
PROTEIN BY MASS SPECTROMETRY (e.g. 
ELECTROSPRAY USED IF PROTEINS ARE 
SEPARATED BY CAPILLARY ELECTROPHORESIS; 
MALDI USED IF PROTEINS ARE SEPARATED 
BY GEL ELECTROPHORESIS) 



FIG. 11



+ 



251 



LABEL PROTEINS/POLYPEPTIDES
(e.g. LABEL ONE OR MANY EXTRACTED 
PROTEINS WITH A MASS LABEL) AND 
ISOLATE EACH LABELLED PROTEIN/ 
POLYPEPTIDE 



253



PERFORM COLLISION INDUCED, IN SOURCE, 
MASS SPECTROMETRY FOR EACH ISOLATED 
PROTEIN/POLYPEPTIDE



255 



TRANSMIT DATA SAMPLE REPRESENTING 
MASS SPECTRUM FROM MASS SPECTROMETER 
TO PROCESSING SYSTEM 



FILTER DATA TO REMOVE PERIODIC NOISE 



PROCESS FILTERED DATA AUTOMATICALLY 
TO OBTAIN PROTEIN SEQUENCE (OR PORTION 
OF PROTEIN SEQUENCE SUCH AS A PST, i.e. A
PROTEIN SEQUENCE TAG)



257 



259 



FIG. 12
301



STORE A PREDETERMINED SET OF MASS/CHARGE 
(M/Z) VALUES FOR AMINO ACID SEQUENCES 
(e.g. STORE M/Z VALUES FOR ALL POSSIBLE 
EXPECTED FRAGMENTS OF LABELLED TERMINAL 
PORTIONS OF ALL POSSIBLE PROTEINS) 



DETERMINE AN ABUNDANCE VALUE FROM SAID 
MASS SPECTRUM DATA FOR EACH M/Z VALUE 
IN THE PREDETERMINED SET OF M/Z VALUES 



303 



CALCULATE A FIRST RANKING (e.g. 
PROBABILITY), BASED ON THE ABUNDANCE 
VALUES, FOR EACH SEQUENCE OF A SET OF 
AMINO ACID (AA) SEQUENCES HAVING A 
FIRST NUMBER OF AAs (e.g. 3 AAs) 



305 



CALCULATE A SECOND RANKING (e.g. 
PROBABILITY), BASED ON THE ABUNDANCE 
VALUES, FOR EACH SEQUENCE OF A SET OF 
AA SEQUENCES HAVING A SECOND 
NUMBER OF AAs (e.g. 4 AAs) 








309 


CALCULATE A CUMULATIVE RANKING (e.g. A 
CUMULATIVE PROBABILITY), BASED ON THE 
FIRST RANKING AND THE SECOND RANKING, 

FOR EACH SEQUENCE OF A SET OF AA 
SEQUENCES HAVING AT LEAST THE SECOND 
NUMBER OF AAs 



FIG. 13



FIG. 14A



351 



FOR EACH POSSIBLE EXPECTED FRAGMENT OF A 
TERMINALLY LABELLED SEQUENCE OF AAs, PERFORM 
LOOKUP IN MASS SPECTRUM DATA FOR ABUNDANCE 
VALUE AT M/Z VALUE OF POSSIBLE EXPECTED FRAGMENT 
(e.g. \( C_{i,j,k,l} = \mathrm{LOOKUP}[(M/Z)_{i,j,k,l}] \))



353



DETERMINE MASTER COUNT, FOR EACH PARTICULAR
POSSIBLE SEQUENCE, OVER ALL POSSIBLE ION
TYPES AND CHARGE STATES FOR PARTICULAR SEQUENCE

e.g.

\[ C_{i,j} = \sum_{k = \mathrm{MIN\ CHARGE\ STATES}}^{\mathrm{MAX\ CHARGE\ STATES}} \; \sum_{l}^{\mathrm{MAX\ ION\ TYPES}} C_{i,j,k,l} \]

WHERE

i=NUMBER OF AMINO ACIDS (AAs) IN SEQUENCE
j=NUMBER OF POSSIBLE SEQUENCES (USUALLY, 19^i)
l=NUMBER OF ION TYPES FOR EACH RESIDUE
(e.g. ION TYPES a, b, AND POSSIBLY c AT
N TERMINUS AND ION TYPES x, y, AND POSSIBLY
z AT C TERMINUS)
k=NUMBER OF CHARGE STATES FOR EACH ION TYPE
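A minimal sketch of the lookup and master-count steps, assuming the spectrum is held as a dictionary keyed by m/z rounded to one decimal place. The helper names `lookup` and `master_count`, the rounding precision, and the example values are all hypothetical.

```python
def lookup(spectrum, mz, ndigits=1):
    """Return the abundance stored at an m/z value, or 0.0 if no bin exists.

    spectrum maps rounded m/z values to counts (an assumed data layout).
    """
    return spectrum.get(round(mz, ndigits), 0.0)


def master_count(spectrum, expected_mz):
    """Sum the counts over every ion type and charge state of one sequence.

    expected_mz maps (ion_type, charge_state) to the expected fragment m/z.
    """
    return sum(lookup(spectrum, mz) for mz in expected_mz.values())


# Made-up numbers purely for illustration.
spectrum = {200.5: 5.0, 100.8: 3.0}
expected = {("b", 1): 200.5, ("a", 1): 172.5, ("b", 2): 100.8}
total = master_count(spectrum, expected)
```

Fragments with no matching bin simply contribute zero, so a candidate sequence accumulates a large master count only when several of its ion types and charge states are present in the spectrum.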



FOR EACH MASTER COUNT (\( C_{i,j} \)) FOR A PARTICULAR
POSSIBLE SEQUENCE AT A GIVEN SEQUENCE LENGTH,
COMPARE THE MASTER COUNT TO ALL OTHER MASTER
COUNTS FOR ALL OTHER POSSIBLE SEQUENCES FOR 
THE GIVEN SEQUENCE LENGTH TO PRODUCE A RANKING 
OF MASTER COUNTS OF ALL POSSIBLE SEQUENCES AT 
THE GIVEN LENGTH OF AAs. FOR EXAMPLE, THE 
COMPARISON MAY BE A RANKING OF PROBABILITIES 
WHERE EACH PROBABILITY IS A PROBABILITY OF A 
PARTICULAR SEQUENCE AT THE GIVEN SEQUENCE 
LENGTH RELATIVE TO ALL OTHER POSSIBLE 
SEQUENCES AT THE SAME LENGTH 



TO FIG. 14B



FIG. 14B



359 





IN ONE EXAMPLE, DETERMINE \( P_{i,j} \) (A FIRST
RANKING) FOR EACH PARTICULAR POSSIBLE SEQUENCE
j AT A GIVEN AA LENGTH i:




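\[ P_{i,j} = \mathrm{NORMAL\ DISTRIBUTION}\left[ \frac{C_{i,j} - \bar{C}_i}{\sigma_i} \right] \]

WHERE:

\[ \bar{C}_i = \frac{1}{19^i} \sum_{j} C_{i,j} \]

AND

\[ \sigma_i = \sqrt{ \frac{ \sum_{j} \left( C_{i,j} - \bar{C}_i \right)^2 }{ 19^i - 1 } } \]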





DETERMINE CUMULATIVE RANKING FOR A GIVEN
SEQUENCE OVER DIFFERENT SEQUENCE
LENGTHS, e.g. CALCULATE

\( \prod_{i} P_{i,j} \) OR \( \sum_{i} P_{i,j} \)



e.g. 



PTcM. a l-tpfr^ "1 _ CUMULATIVE 



361



SELECT SEQUENCE WITH HIGHEST 
CUMULATIVE RANKING 
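The cumulative ranking and selection steps might be sketched as below, using the product form. This assumes that the ranking of a longer sequence is combined with the rankings of its own shorter leading portions (its prefixes), which is an interpretation of "over different sequence lengths" and not something the drawing states explicitly. The function name and data layout are hypothetical.

```python
def cumulative_rankings(rankings_by_length):
    """Multiply per-length rankings along each candidate sequence's prefixes.

    rankings_by_length maps a length i to {sequence_of_length_i: P}.
    Returns {longest_sequence: product of P over all stored lengths}.
    """
    lengths = sorted(rankings_by_length)
    result = {}
    for seq in rankings_by_length[lengths[-1]]:
        p = 1.0
        for i in lengths:
            # A prefix that was never ranked contributes zero probability.
            p *= rankings_by_length[i].get(seq[:i], 0.0)
        result[seq] = p
    return result


# Made-up rankings purely for illustration.
rankings = {
    1: {"G": 0.9, "A": 0.4},
    2: {"GL": 0.8, "GA": 0.3, "AL": 0.9},
}
cumulative = cumulative_rankings(rankings)
best = max(cumulative, key=cumulative.get)
```

In this made-up case "AL" has the highest two-residue ranking on its own, but "GL" wins once the one-residue ranking is taken into account, which is the point of accumulating over lengths.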



363 



DETERMINE MOLECULAR WEIGHT (MW) OF A 
PARTICULAR RESIDUE/SEQUENCE (e.g. LABEL 
-Ala WHICH IS i=1, j=Ala) FROM NUMBER 
OF ATOMS (N) OF EACH ELEMENT AND MW 
OF ELEMENT — e.g. 

\( MW_{seq} = N_C \times MW_C + N_H \times MW_H + N_N \times MW_N + \ldots \)
(WHERE \( N_C \) = NUMBER OF CARBON ATOMS IN RESIDUE,
\( MW_C \) = 12 (WEIGHT OF CARBON), etc.)
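As a worked illustration of the molecular weight step, the sketch below sums atom counts times atomic weights for an alanine residue (C3H5NO). It uses abridged average atomic weights instead of the rounded value of 12 shown in the box, and it leaves out the mass label, whose composition is not given here. The names are hypothetical.

```python
# Abridged standard atomic weights in daltons.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def molecular_weight(composition):
    """Sum N_element * MW_element over an elemental composition.

    composition maps an element symbol to its atom count.
    """
    return sum(ATOMIC_WEIGHT[element] * n for element, n in composition.items())


# Alanine residue within a chain (one water lost relative to free alanine).
ala_residue = {"C": 3, "H": 5, "N": 1, "O": 1}
mw_ala = molecular_weight(ala_residue)  # about 71.08 Da
```

A labelled residue would be handled the same way by adding the label's atoms to the composition before summing.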



451 



DETERMINE (OR LOOKUP STORED) WEIGHT 
ADJUSTMENTS FOR ION TYPES OF THE 
PARTICULAR RESIDUE/SEQUENCE AND 
CALCULATE ADJUSTED MWs FOR EACH 
DIFFERENT ION TYPE FROM MW AND 

WEIGHT ADJUSTMENT 
(e.g. \( ADJ_a\,MW_{seq} = MW_{seq} - a(adj) \);
\( ADJ_b\,MW_{seq} = MW_{seq} - b(adj) \))



453 



DETERMINE (OR LOOK UP) POSSIBLE CHARGE
STATE ADJUSTMENTS FOR EACH \( ADJ_a\,MW_{seq} \),
\( ADJ_b\,MW_{seq} \), etc. AND CALCULATE CORRESPONDING
M/Zs FROM THE \( ADJ_a\,MW_{seq} \), etc.
DATA AND THE CHARGE STATE ADJUSTMENT
DATA, e.g.
\( M/Z = ADJ_a\,MW_{seq} + Z_k\,MW_H \)

e.g., THIS PRODUCES A SET OF M/Z VALUES
FOR j=Ala FOR ALL POSSIBLE k
AND l COMBINATIONS
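The ion-type and charge-state steps can be sketched as a small table builder. Two assumptions are made loudly here: the weight adjustment is subtracted from the sequence weight, as in the box for step 453, and the final m/z uses the conventional mass spectrometry definition (adjusted mass plus z protons, divided by z), which may differ in form from the expression printed in the drawing. The function name, the adjustment values, and the example mass are hypothetical.

```python
PROTON_MASS = 1.00728  # mass of a proton in daltons


def mz_table(mw_seq, ion_adjustments, charge_states):
    """Build {(ion_type, z): m/z} for one residue or sequence.

    ion_adjustments maps an ion type to the weight subtracted from mw_seq.
    m/z is computed with the conventional (M + z * proton) / z form, which
    is an assumption and not a transcription of the figure.
    """
    table = {}
    for ion, adj in ion_adjustments.items():
        adjusted = mw_seq - adj
        for z in charge_states:
            table[(ion, z)] = (adjusted + z * PROTON_MASS) / z
    return table


# Illustrative values only: a 100 Da fragment, with the a-type ion taken
# as 28 Da lighter than the b-type ion.
table = mz_table(100.0, {"b": 0.0, "a": 28.0}, charge_states=(1, 2))
```

The result holds one entry per (ion type, charge state) pair, which is the set of m/z values that the subsequent lookup step walks through.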











SAVE (e.g. TEMPORARILY) IN L1 OR L2 CACHE
(OR IN SCRATCH PAD MEMORY IF POSSIBLE)
CURRENT SET OF M/Z VALUES



455 
TO FIG. 18B



FROM FIG. 18A
PERFORM LOOKUP IN MASS SPECTRUM DATA
AT EACH M/Z VALUE IN THE SAVED CURRENT
SET OF M/Z VALUES TO OBTAIN ABUNDANCE
VALUE AT EACH M/Z



461 



ERASE CURRENT SET OF M/Z VALUES IN 
CACHE (NOTE: MAY ERASE BY WRITING 
NEW CURRENT SET IN LATER ITERATION) 



REPEAT M/Z CALCULATIONS FOR NEXT POSSIBLE 

SEQUENCE (AND REPEAT M/Z CALCULATIONS 
FOR ALL OTHER POSSIBLE TERMINAL SEQUENCES 
UP TO n AAs IN LENGTH) 
(RETURN TO 451) 
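The outer loop over candidate sequences can be sketched with a generator, so that only the current candidate is held in memory, loosely mirroring the idea of overwriting the cached set on each iteration. The 19-letter alphabet below drops isoleucine on the assumption that the figure's count of 19 reflects leucine and isoleucine sharing one mass; that reason is not stated in the drawing. The names are hypothetical.

```python
from itertools import product

# Assumed alphabet: the 20 standard one-letter codes minus "I".
RESIDUES = "ACDEFGHKLMNPQRSTVWY"


def candidate_sequences(max_len, residues=RESIDUES):
    """Yield every possible terminal sequence from length 1 up to max_len."""
    for length in range(1, max_len + 1):
        for combo in product(residues, repeat=length):
            yield "".join(combo)


# There are 19**i candidates at length i, so lengths 1 and 2 give 19 + 361.
n_candidates = sum(1 for _ in candidate_sequences(2))
```

Because the count grows as 19 to the power of the length, each candidate's m/z set would be computed, looked up, and discarded in turn instead of being stored for all candidates at once.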



FIG. 18B



ACCUMULATE A COUNT SUM AS DETERMINE EACH 
COUNT (\( C_{i,j,k,l} \)) FROM LOOKUP OPERATION
(e.g. \( SUM_i = \mathrm{PRIOR}\ SUM_i + C_{i,j,k,l} \))
FOR ALL POSSIBLE SEQUENCES OF A NUMBER/ 
LENGTH OF AAs (e.g. i=4 AAs) 



501 



503 






ACCUMULATE A (COUNT)^2 SUM AS DETERMINE
EACH COUNT (\( C_{i,j,k,l} \)) FROM LOOKUP OPERATION
(e.g. \( SUMSQ_i = \mathrm{PRIOR}\ SUMSQ_i + (C_{i,j,k,l})^2 \))

FOR ALL POSSIBLE SEQUENCES OF THE NUMBER/
LENGTH OF AAs (i)



AFTER ITERATING THROUGH EACH LOOKUP 
OPERATION AT ALL POSSIBLE M/Z VALUES 
IN SET OF M/Z VALUES, 
CALCULATE MEAN (\( \bar{C}_i \)) AND STANDARD
DEVIATION (\( \sigma_i \)) FOR GIVEN NUMBER/LENGTH (i)

\[ \bar{C}_i = \frac{SUM_i}{n} \qquad \sigma_i = \sqrt{ \frac{ SUMSQ_i - (SUM_i)^2 / n }{ n - 1 } } \]
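The running-sum bookkeeping of this figure can be sketched with a small accumulator that keeps only a count, a sum, and a sum of squares, then derives the mean and sample standard deviation at the end. The class name `RunningStats` is hypothetical, and the n - 1 denominator is assumed from the fragment visible in the drawing.

```python
import math


class RunningStats:
    """Accumulate SUM and SUMSQ one count at a time (illustrative sketch)."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.total_sq = 0.0

    def add(self, count):
        # Mirrors SUM = PRIOR SUM + C and SUMSQ = PRIOR SUMSQ + C^2.
        self.n += 1
        self.total += count
        self.total_sq += count * count

    def mean(self):
        return self.total / self.n

    def std(self):
        # Sample standard deviation from the two running sums; the max()
        # guards against a tiny negative value caused by rounding.
        variance = (self.total_sq - self.total ** 2 / self.n) / (self.n - 1)
        return math.sqrt(max(variance, 0.0))


stats = RunningStats()
for c in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(c)
```

Keeping only the two sums means the individual counts never need to be stored, which matters when the number of lookups grows with every added residue.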



FOR EACH POSSIBLE AA SEQUENCE OF LENGTH i,
PERFORM LOOKUP OPERATIONS FOR ALL 
POSSIBLE M/Z VALUES FOR THAT POSSIBLE 
AA SEQUENCE AND DETERMINE RANKING FOR 
THAT SEQUENCE (e.g. RANKING = \( P_{i,j} \), WHERE

\[ P_{i,j} = \mathrm{PROBABILITY\ DISTRIBUTION\ FUNCTION}\left[ \frac{C_{i,j} - \bar{C}_i}{\sigma_i} \right] \])



AND SAVE EACH RANKING (e.g. IN NON-VOLATILE 
MEMORY) IN ORDER TO CALCULATE CUMULATIVE
RANKINGS 
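One way to read the ranking step is as a z-score passed through a normal cumulative distribution function. The sketch below takes that reading as an assumption; the drawing names only a "probability distribution function". The function name and the example counts are hypothetical.

```python
import math


def rank_sequences(master_counts):
    """Convert master counts for one sequence length into rankings.

    master_counts maps each candidate sequence to its count. Each ranking
    is the standard normal CDF of the count's z-score (an assumed reading).
    """
    n = len(master_counts)
    if n < 2:
        raise ValueError("need at least two sequences to estimate a spread")
    mean = sum(master_counts.values()) / n
    variance = sum((c - mean) ** 2 for c in master_counts.values()) / (n - 1)
    sigma = math.sqrt(variance)
    if sigma == 0.0:
        raise ValueError("all counts are identical; ranking is undefined")
    return {
        seq: 0.5 * (1.0 + math.erf((c - mean) / (sigma * math.sqrt(2.0))))
        for seq, c in master_counts.items()
    }


# Made-up counts purely for illustration.
rankings = rank_sequences({"A": 1.0, "B": 2.0, "C": 3.0})
```

A sequence whose count sits at the mean ranks 0.5, and one that sits a full standard deviation above the mean ranks about 0.84, so outlying high counts stand out clearly.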



FIG. 19
FIG. 20A
[Plot: Baselined Mass Spectrum with N-terminal b-ion Mass Positions; x-axis: m/z (amu); annotation: 5-Br-3-PAA-GLSDGE (true 6 residue sequence)]



FIG. 33




VX6. ^ 



5'-TCAGTGCTGCTGCAACATGTTACAGGAAAT-3'

'unknown' sequence 



M13 primer



W 



AGTCACGACGACGTTGTrA 
-TCAGTGCTGCTGCAACATGTTACAGGAAAT ' 



: 5 



DNA polymerase 
dNTP mix ^7 4. 

ddATP* (see Fig +2^) 
ddGTP* (see Fig Yt^¥) 
!>1B 

RNAse; denaturation 



^ primer 



removal removal* 



CAATGTCCTTTA* 

CAATG* 

CAA* 

CA* 



MS 



DNA polymerase 
dNTP mix j 7c 
ddTTP* (see Fig 
ddCTP* (see Fig 

3~7£> 

RNAse; denaturation 



CAATGTCCTTT* 

CAATGTCCTT* 

CAATGTCCT* 

CAATGTCC* 

CAATGTC* 

CAATGT* 

CAAT* 

C* 



MS 

analysis 
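The two termination reactions above together give one fragment for every length, so the sequence can be read by ordering the fragments and taking each terminal base. The sketch below works on the fragment strings listed in the figure; in an actual measurement the mass spectrometer would report masses, and the terminal base would have to be inferred from those, which is not shown. The function name is hypothetical.

```python
def read_ladder(fragments):
    """Order chain-terminated fragments by length and read each last base.

    Raises ValueError if any length from 1 to the maximum is missing.
    """
    ordered = sorted(set(fragments), key=len)
    lengths = [len(f) for f in ordered]
    if lengths != list(range(1, len(ordered) + 1)):
        raise ValueError("ladder has a gap or two fragments of equal length")
    return "".join(f[-1] for f in ordered)


# Fragments as listed for the ddATP*/ddGTP* reaction.
reaction_a_g = ["CAATGTCCTTTA", "CAATG", "CAA", "CA"]
# Fragments as listed for the ddTTP*/ddCTP* reaction.
reaction_t_c = [
    "CAATGTCCTTT", "CAATGTCCTT", "CAATGTCCT", "CAATGTCC",
    "CAATGTC", "CAATGT", "CAAT", "C",
]
sequence = read_ladder(reaction_a_g + reaction_t_c)
```

Neither reaction alone covers every length, which is why the check for gaps only passes once both fragment sets are merged.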